You may not have noticed, but most of the objects we’re asked to identify in these puzzles are roads, things that travel on roads, things that go over roads, things that are on roads, or people near roads. Ever wondered why?
The idea behind it all was that robots would not be able to read letters and numbers printed in a distorted format, but humans would. Bots are good at logic. Humans are good at abstraction. So researchers devised a program that displays garbled codes for humans to type in before they can access a website its owner wishes to secure. This program, called CAPTCHA, was introduced in the early 2000s by Carnegie Mellon University researchers led by computer scientist Luis von Ahn.
The first CAPTCHA codes were simple. You retyped the letters and numbers that appeared in an image to gain access to a secure site, thus proving you weren’t a spambot. Then in 2006 von Ahn had the idea of using CAPTCHAs to decipher old, blurry text in archival documents. “So we asked, ‘Can we do something useful with this time?’” he told the New York Times in an interview. And thus reCAPTCHA was born.
Google bought the company in 2009 and, realizing the potential, used the web’s hive mind to collect data through reCAPTCHA. That data became so accurate that Google’s AI (artificial intelligence) systems could distinguish between a Doberman and a Golden Retriever.
Then it realized it could use the data from billions of these image-based puzzles in the AI controlling its self-driving cars. In 2014, Google introduced a modified version of reCAPTCHA, called No CAPTCHA reCAPTCHA, to replace the old reCAPTCHA codes. And by picking out pedestrians, fire hydrants, and other hazards in blurry images, we all helped provide the data that would keep those cars from running down a pedestrian or whacking a fire hydrant.
The important bit about AI systems is that they need vast amounts of data to practice and learn. The more data they get, the better and more accurate they become. Google has a dedicated team working on improving the accuracy of the AI systems at the core of Waymo cars. However, even that team cannot create the amount of data, or correlated data, that millions of online users can generate. This correlated data can then be used to train AI systems for tasks ranging from identifying a street sign or an animal to detecting the season of the year or the object in front of the car. This in turn can improve the Waymo car’s decision making, its ability to assess and respond to real-life situations, such as how to react when a cow is in its way or a tree blocks its path.
Recently, web infrastructure company Cloudflare estimated that humanity collectively spends 500 years of labor each day on CAPTCHAs. That is not just inefficient for us. It turns out that while we’re proving we’re not robots, we’re also teaching robots to think like us.
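To get a feel for where a number like that comes from, here is a rough back-of-the-envelope sketch. The inputs are assumptions recalled from Cloudflare’s published estimate (about 4.6 billion internet users, roughly one CAPTCHA every 10 days per user, and about 32 seconds per CAPTCHA) and should be checked against the original before being quoted. The function name is made up for illustration.

```python
# Back-of-the-envelope estimate of human time spent on CAPTCHAs per day.
# All three inputs below are assumptions, not figures from this article.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def captcha_years_per_day(users, seconds_per_captcha, days_between_captchas):
    """Return the years of human time spent on CAPTCHAs in a single day."""
    # On an average day, only a fraction of users (1 / days_between_captchas)
    # actually sees a CAPTCHA.
    seconds_per_day = users * seconds_per_captcha / days_between_captchas
    return seconds_per_day / SECONDS_PER_YEAR


years = captcha_years_per_day(
    users=4.6e9,               # assumed number of internet users
    seconds_per_captcha=32,    # assumed average time to solve one CAPTCHA
    days_between_captchas=10,  # assumed: one CAPTCHA every 10 days per user
)
print(f"About {years:.0f} years of human time per day")
```

With these inputs the result lands a little under 500 years per day, in line with the headline figure. Change any one assumption and the total scales in proportion.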
Google is not the only company that uses user-generated data to train its algorithms and machine learning systems. Companies like Facebook, Microsoft and Amazon also do it. When you upload an image to Facebook, that image is used by Facebook’s systems for training purposes. Facebook has announced that it has built an artificial intelligence program that can “see” what it is looking at. It did this by feeding the program over 1 billion public images from Instagram. The “computer vision” program, nicknamed SEER, outperformed existing AI models in an object recognition test, Facebook said.